representative agent
LLM-Guided Reinforcement Learning with Representative Agents for Traffic Modeling
Large language models (LLMs) are increasingly used as behavioral proxies for self-interested travelers in agent-based traffic models. Although these approaches are more flexible and generalizable than conventional models, their practical use remains limited by scalability, because one LLM must be called for every traveler. Moreover, LLM agents have been found to make opaque choices and produce unstable day-to-day dynamics. To address these challenges, we propose to model each homogeneous traveler group facing the same decision context with a single representative LLM agent that behaves like the population's average, maintaining and updating a mixed strategy over routes that coincides with the group's aggregate flow proportions. Each day, the LLM reviews the travel experience and flags with positive reinforcement the routes it hopes to use more often; an interpretable update rule then converts this judgment into strategy adjustments using a tunable (progressively decaying) step size. The representative-agent design improves scalability, and the separation of reasoning from updating clarifies the decision logic while stabilizing learning. In classic traffic assignment settings, we find that the proposed approach converges rapidly to the user equilibrium. In richer settings with income heterogeneity, multi-criteria costs, and multi-modal choices, the generated dynamics remain stable and interpretable, reproducing plausible behavioral patterns that are well documented in psychology and economics, such as the decoy effect in toll versus non-toll road selection and a higher willingness to pay for convenience among higher-income travelers when choosing between driving, transit, and park-and-ride options.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Switzerland (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- Leisure & Entertainment > Games (1.00)
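The update loop described in the traffic-modeling abstract above (flag routes, then shift a mixed strategy toward them with a decaying step size) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's method: `flag_routes` is a hypothetical stand-in for the LLM's judgment that simply flags the cheapest route, the update is a convex combination toward the flagged routes, the step size is `1 / (day + 1)`, and the two-route network with linear congestion costs is a toy example whose user equilibrium puts two thirds of the flow on the first route.

```python
def flag_routes(costs, tol=1e-9):
    """Hypothetical stand-in for the LLM: flag the cheapest route(s)."""
    best = min(costs)
    return [i for i, c in enumerate(costs) if c - best <= tol]


def update_strategy(p, flagged, step):
    """Shift the mixed strategy toward the flagged routes by `step`."""
    # Target distribution: uniform over flagged routes, zero elsewhere.
    target = [1.0 / len(flagged) if i in flagged else 0.0 for i in range(len(p))]
    # A convex combination keeps p a valid probability vector.
    return [(1.0 - step) * pi + step * ti for pi, ti in zip(p, target)]


def route_costs(p):
    """Assumed toy network: two routes with linear congestion in flow share."""
    return [10.0 + 20.0 * p[0], 20.0 + 10.0 * p[1]]


def simulate(days=200, p0=(0.5, 0.5)):
    """Run the day-to-day loop with an assumed 1 / (day + 1) step size."""
    p = list(p0)
    for day in range(1, days + 1):
        flagged = flag_routes(route_costs(p))
        p = update_strategy(p, flagged, step=1.0 / (day + 1))
    return p


final = simulate()
print(final)
```

With this particular step-size schedule the loop coincides with the classical method of successive averages, so the strategy oscillates around the equal-cost split with shrinking amplitude. A real LLM judgment in place of `flag_routes` would not come with that guarantee.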
Efficient and Scalable Deep Reinforcement Learning for Mean Field Control Games
Mean Field Control Games (MFCGs) provide a powerful theoretical framework for analyzing systems of infinitely many interacting agents, blending elements from Mean Field Games (MFGs) and Mean Field Control (MFC). However, solving the coupled Hamilton-Jacobi-Bellman and Fokker-Planck equations that characterize MFCG equilibria remains a significant computational challenge, particularly in high-dimensional or complex environments. This paper presents a scalable deep Reinforcement Learning (RL) approach to approximate equilibrium solutions of MFCGs. Building on previous work, we reformulate the infinite-agent stochastic control problem as a Markov Decision Process, where each representative agent interacts with the evolving mean field distribution. We use the actor-critic algorithm from a previous paper (Angiuli et al., 2024) as the baseline and propose several more scalable and efficient variants, using techniques including parallel sample collection (batching), mini-batching, target networks, proximal policy optimization (PPO), generalized advantage estimation (GAE), and entropy regularization. By leveraging these techniques, we improve the efficiency, scalability, and training stability of the baseline algorithm. We evaluate our method on a linear-quadratic benchmark problem, where an analytical solution to the MFCG equilibrium is available. Our results show that some versions of our proposed approach achieve faster convergence and closely approximate the theoretical optimum, outperforming the baseline algorithm by an order of magnitude in sample efficiency. Our work lays the foundation for adapting deep RL to solve more complex, real-world MFCGs, such as large-scale autonomous transportation systems, multi-firm economic competition, and inter-bank borrowing problems.
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
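Of the techniques listed in the MFCG abstract above, generalized advantage estimation (GAE) is compact enough to sketch. The function below is a generic, textbook-style version for a single trajectory and is not taken from the paper: the name `gae`, the default `gamma` and `lam` values, and the convention that `values` carries one extra bootstrap entry are assumptions, and episode-termination flags are omitted for brevity.

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory.

    `values` must have len(rewards) + 1 entries; the last entry is the
    bootstrap value of the state reached after the final reward.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step reuses the discounted sum from the next step.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


print(gae([1.0, 1.0], [0.0, 0.0, 0.0], gamma=0.5, lam=1.0))
```

Setting `lam=0` reduces the estimate to the one-step TD error, while `lam=1` gives the discounted return minus the value baseline; intermediate values trade bias against variance.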
Accelerating Hybrid Agent-Based Models and Fuzzy Cognitive Maps: How to Combine Agents who Think Alike?
Giabbanelli, Philippe J., Beerman, Jack T.
While Agent-Based Models can create detailed artificial societies based on individual differences and local context, they can be computationally intensive. Modelers may offset these costs through a parsimonious use of the model, for example by using smaller population sizes (which limits analyses in sub-populations), running fewer what-if scenarios, or accepting more uncertainty by performing fewer simulations. Alternatively, researchers may accelerate simulations via hardware solutions (e.g., GPU parallelism) or approximation approaches that trade off accuracy against compute time. In this paper, we present an approximation that combines agents who 'think alike', thus reducing the population size and the compute time. Our innovation relies on representing agent behaviors as networks of rules (Fuzzy Cognitive Maps) and empirically evaluating different measures of distance between these networks. Then, we form groups of think-alike agents via community detection and simplify them to a representative agent. Case studies show that our simplifications remain accurate.
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- North America > United States > Hawaii (0.05)
- Europe > Switzerland (0.04)
Unified continuous-time q-learning for mean-field game and mean-field control problems
Wei, Xiaoli, Yu, Xiang, Yuan, Fengyi
This paper studies continuous-time q-learning in mean-field jump-diffusion models from the representative agent's perspective. To overcome the challenge that the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterization together with the value function, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, depending on whether the task is to solve the MFG or the MFC problem, we can employ the decoupled Iq-function in different ways to learn the mean-field equilibrium policy or the mean-field optimal policy, respectively. As a result, we devise a unified q-learning algorithm for both MFG and MFC problems by utilizing all test policies stemming from the mean-field interactions. For several examples in the jump-diffusion setting, within and beyond the LQ framework, we can obtain the exact parameterization of the decoupled Iq-functions and the value functions, and illustrate our algorithm from the representative agent's perspective with satisfactory performance.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- Asia > China > Hong Kong > Kowloon (0.04)
- Asia > China > Heilongjiang Province > Harbin (0.04)
- Research Report (0.69)
- Overview (0.47)
Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning
Jusup, Matej, Pásztor, Barna, Janik, Tadeusz, Zhang, Kenan, Corman, Francesco, Krause, Andreas, Bogunovic, Ilija
Many applications, e.g., in shared mobility, require coordinating a large number of agents. Mean-field reinforcement learning addresses the resulting scalability challenge by optimizing the policy of a representative agent interacting with the infinite population of identical agents instead of considering individual pairwise interactions. In this paper, we address an important generalization where there exist global constraints on the distribution of agents (e.g., requiring capacity constraints or minimum coverage requirements to be met). We propose Safe-M$^3$-UCRL, the first model-based mean-field reinforcement learning algorithm that attains safe policies even in the case of unknown transitions. As a key ingredient, it uses epistemic uncertainty in the transition model within a log-barrier approach to ensure pessimistic constraint satisfaction with high probability. Beyond the synthetic swarm motion benchmark, we showcase Safe-M$^3$-UCRL on the vehicle repositioning problem faced by many shared mobility operators and evaluate its performance through simulations built on vehicle trajectory data from a service provider in Shenzhen. Our algorithm effectively meets the demand in critical areas while ensuring service accessibility in regions with low demand.
- North America > United States (0.28)
- Asia > China > Guangdong Province > Shenzhen (0.25)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Energy (0.92)
Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning
Pasztor, Barna, Bogunovic, Ilija, Krause, Andreas
Learning in multi-agent systems is highly challenging due to the inherent complexity introduced by agents' interactions. We tackle systems with a huge population of interacting agents (e.g., swarms) via Mean-Field Control (MFC). MFC considers an asymptotically infinite population of identical agents that aim to collaboratively maximize the collective reward. Specifically, we consider the case of unknown system dynamics where the goal is to simultaneously optimize for the rewards and learn from experience. We propose an efficient model-based reinforcement learning algorithm $\text{M}^3\text{-UCRL}$ that runs in episodes and provably solves this problem. $\text{M}^3\text{-UCRL}$ uses upper-confidence bounds to balance exploration and exploitation during policy learning. Our main theoretical contributions are the first general regret bounds for model-based RL for MFC, obtained via a novel mean-field type analysis. $\text{M}^3\text{-UCRL}$ can be instantiated with different models such as neural networks or Gaussian Processes, and effectively combined with neural network policy learning. We empirically demonstrate the convergence of $\text{M}^3\text{-UCRL}$ on the swarm motion problem of controlling an infinite population of agents seeking to maximize location-dependent reward and avoid congested areas.
- Research Report (1.00)
- Overview (0.92)
Entropy Regularization for Mean Field Games with Learning
Guo, Xin, Xu, Renyuan, Zariphopoulou, Thaleia
Entropy regularization has been extensively adopted to improve the efficiency, stability, and convergence of algorithms in reinforcement learning. This paper analyzes both quantitatively and qualitatively the impact of entropy regularization for Mean Field Games (MFGs) with learning in a finite time horizon. Our study provides a theoretical justification that entropy regularization yields time-dependent policies and, furthermore, helps stabilize and accelerate convergence to the game equilibrium. In addition, this study leads to a policy-gradient algorithm for exploration in MFGs. Under this algorithm, agents are able to learn the optimal exploration scheduling, with stable and fast convergence to the game equilibrium.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Texas (0.14)
- North America > United States > California (0.14)
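As background for the entropy-regularization abstract above, the sketch below shows the standard one-step, finite-action case, which is much simpler than the finite-horizon MFG setting the paper analyzes. Maximizing expected payoff plus `temperature` times the Shannon entropy over a finite action set gives a softmax policy. The function names, the payoff vector `q_values`, and the `temperature` parameter are illustrative assumptions.

```python
import math


def entropy_regularized_policy(q_values, temperature):
    """Maximizer of sum_a pi[a] * q[a] + temperature * H(pi) over distributions."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(pi):
    """Shannon entropy H(pi) in nats; zero-probability actions contribute 0."""
    return -sum(p * math.log(p) for p in pi if p > 0.0)


for temperature in (0.01, 1.0, 1000.0):
    pi = entropy_regularized_policy([1.0, 0.0], temperature)
    print(temperature, pi, entropy(pi))
```

A small temperature concentrates the policy on the best action (little exploration), and a large temperature pushes it toward the uniform distribution. This is the sense in which the regularization weight acts as an exploration knob.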
Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications
Perrin, Sarah, Perolat, Julien, Laurière, Mathieu, Geist, Matthieu, Elie, Romuald, Pietquin, Olivier
In this paper, we extend the analysis of the continuous-time Fictitious Play learning algorithm to various finite-state Mean Field Game settings (finite horizon, $\gamma$-discounted), allowing in particular for the introduction of an additional common noise. We first present a theoretical convergence analysis of the continuous-time Fictitious Play process and prove that the induced exploitability decreases at a rate $O(\frac{1}{t})$. This analysis emphasizes the use of exploitability as a relevant metric for evaluating convergence towards a Nash equilibrium in the context of Mean Field Games. These theoretical contributions are supported by numerical experiments in both model-based and model-free settings. We thereby provide, for the first time, converging learning dynamics for Mean Field Games in the presence of common noise.
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Middlesex County > Reading (0.04)
- Leisure & Entertainment > Games (1.00)
- Energy (1.00)
- Information Technology (0.68)
- Information Technology > Game Theory (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Towards a Characterization of Explainable Systems
Bohlender, Dimitri, Köhl, Maximilian A.
Building software-driven systems that are easily understood is becoming a challenge as their complexity and autonomy keep increasing. Accordingly, recent research efforts strive to aid in designing explainable systems. Nevertheless, a common notion of what it takes for a system to be explainable is still missing. To address this problem, we propose a characterization of explainable systems that consolidates existing research. By providing a unified terminology, we lay a basis for the classification of both existing and future research, and for the formulation of precise requirements for such systems.
- North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
- North America > United States > California > Monterey County > Monterey (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)